Uniform clip sampling + multi-clip sampling function#57
Uniform clip sampling + multi-clip sampling function#57themattinthehatt wants to merge 13 commits into
Conversation
sfmig
left a comment
There was a problem hiding this comment.
Hi @themattinthehatt, thanks for this!
I made a few suggestions: two of them are two options to try to keep the kwargs within static type checking; the other one is a decoupling suggestion that I think can help simplify tests and has other benefits too.
I think the decoupled approach would be my preferred one in terms of clarity, but it may be a personal preference too. I leave you all the options, let me know what you think. I requested changes in case you want to go for any one of them, I'd be happy to have another look. I haven't checked the tests in detail for the same reason.
| if args.sampling == "uniform": | ||
| if args.num_clips is None: | ||
| raise SystemExit( | ||
| "error: --num_clips is required when --sampling uniform" | ||
| ) | ||
| kwargs["num_clips"] = args.num_clips | ||
| elif args.sampling == "manual": | ||
| if not args.start_frames: | ||
| raise SystemExit( | ||
| "error: --start_frames is required when --sampling manual" | ||
| ) |
There was a problem hiding this comment.
These checks are enforced twice: once in main and once in extract_clips. It makes sense to have a CLI-friendly error (i.e. no traceback, to report wrong inputs) and an API-friendly one (with traceback) but we can reduce code duplication a bit more with the suggestion below, that "translates" the API error to a CLI one.
(It assumes the first suggestion for the extract_clips signature above was applied)
def main(args: argparse.Namespace) -> None:
try:
extract_clips(
args.video_path,
args.duration,
args.sampling,
num_clips=args.num_clips,
start_frames=args.start_frames,
)
except ValueError as e:
raise SystemExit(f"error: {e}")Or if we go for the dataclass approach, the purpose of main shifts to instead catch None input cases (rather than simply translate the error):
def main(args):
if args.sampling == "uniform":
if args.num_clips is None:
raise SystemExit("error: --num_clips is required when --sampling uniform")
sampling = Uniform(num_clips=args.num_clips)
# or for the decoupled approach we can directly have:
# extract_clips_uniform(args.video_path, args.duration, args.num_clips)
elif args.sampling == "manual":
if not args.start_frames:
raise SystemExit("error: --start_frames is required when --sampling manual")
sampling = Manual(frames=args.start_frames)
# or for the decoupled approach we can directly have:
# extract_clips(args.video_path, args.duration, args.start_frames)
# the following call would not be needed in the decoupled approach
extract_clips(args.video_path, args.duration, sampling)There was a problem hiding this comment.
I like the first option better; the dataclass approach feels overly complicated to me (at least at this point)
| def extract_clips( | ||
| video_path: Path, | ||
| duration: int, | ||
| sampling: str, | ||
| **kwargs: Any, |
There was a problem hiding this comment.
Ideally, the function signature would be explicit about the different keyword arguments, rather than managing this via the docstring. That way we don't lose the static type checking on those inputs (i.e. right now mypy won't flag num_clips if it is not an int).
One option to do this would be to define all possible inputs and set them to None by default, then branching based on that:
def extract_clips(
video_path: Path,
duration: int,
sampling: Literal["manual", "uniform"],
*,
num_clips: int | None = None,
start_frames: list[int] | None = None,
) -> list[tuple[Path, Path | None]]:
if sampling == "manual":
if start_frames is None:
raise ValueError("sampling='manual' requires start_frames")
frames = list(start_frames)
elif sampling == "uniform":
if num_clips is None:
raise ValueError("sampling='uniform' requires num_clips")
n_frames = sio.load_video(Path(video_path)).shape[0]
frames = _uniform_start_frames(num_clips, duration, n_frames)
else:
raise ValueError(f"Unknown sampling strategy {sampling!r}")
return [extract_clip(video_path, sf, duration) for sf in frames]Claude also suggested an alternative approach using dataclasses that may be more convenient if we expect the number of sampling strategies/input parameters to grow. Something like the snippet below:
from dataclasses import dataclass
@dataclass
class Uniform:
num_clips: int
def start_frames(self, duration: int, n_frames: int) -> list[int]:
return _uniform_start_frames(self.num_clips, duration, n_frames)
@dataclass
class Manual:
frames: list[int]
def start_frames(self, duration: int, n_frames: int) -> list[int]:
return list(self.frames)
Sampling = Uniform | Manual
def extract_clips(
video_path: Path, duration: int, sampling: Sampling
) -> list[tuple[Path, Path | None]]:
n_frames = sio.load_video(Path(video_path)).shape[0]
frames = sampling.start_frames(duration, n_frames)
return [extract_clip(video_path, sf, duration) for sf in frames]To call it, you would do extract_clips(p, 100, Uniform(num_clips=5)). This one feels a bit more complex but has two nice benefits:
- you cannot make a
Uniforminstance withoutnum_clips, so we can do away with the "API" validation / error handling - if we add more sampling strategies (which I think we will), we don't need to touch
extract_clips
There was a problem hiding this comment.
I agree about being explicit with keyword args for improved type hints/checking, good idea 👌
| return [extract_clip(video_path, sf, duration) for sf in start_frames] | ||
|
|
||
|
|
||
| def _uniform_start_frames( |
There was a problem hiding this comment.
The current approach in extract_clips bundles together input preparation (i.e. computing start_frames based on the sampling strategy) and the action (i.e. the extraction of the frames). The same applies for the suggestions I gave. This makes a more convenient function call, but the coupling of these two objectives has some downsides.
We can decouple it (initially this is what I had in mind). The extract_clips function can take start_frames as input, and then we have several functions that generate these start frames (e.g., the current _uniform_start_frames). For convenience, we can have wrappers that we can wire to the CLI. These wrappers combine extract_clips with a start-frame-production function:
# "core" function, only extracts
def extract_clips(
video_path: Path,
duration: int,
start_frames: list[int],
) -> list[tuple[Path, Path | None]]:
# potentially some validation here
return [extract_clip(video_path, sf, duration) for sf in start_frames]
# wrapper, prepares the input. We link this to the CLI
def extract_clips_uniform(video_path, duration, num_clips):
n_frames = sio.load_video(video_path).shape[0]
frames = _uniform_start_frames(num_clips, duration, n_frames)
return extract_clips(video_path, duration, frames)This decoupled approach has a few benefits:
- simpler and smaller unit tests (because each function does one thing)
- it scales nicely to other strategies (contributors need to add a "start-frames" function and then write the thin wrapper that collects the call;
extract_clipsstays the same) - the dispatch that translates a string (
uniform) into a function (_uniform_start_frames) now occurs at the CLImain(it is removed entirely from here). We would need to keep the None-guards inmainas mentioned above tho.
I think it would also help with the validation of inputs:
- Right now,
start_framesvalues aren't validated as a list of positive integers anywhere (each frame is only checked individually inside the loop that callsextract_clip, which is late to catch the error). This is easy to miss with the current approach I think, because validation is a bit spread across different places. The start frame validation would belong nicely within the decoupledextract_clips, and that way we can report all bad frames at once instead of dying on the first. - Claude suggests to use a small validation function, to use in
extract_clipandextract_clips. But in the multi-clip approach, we would validate the inputs up-front.
def _validate_clip_request(start_frame: int, duration: int) -> None:
if start_frame < 0:
raise ValueError(f"start_frame must be non-negative, got {start_frame}")
if duration <= 0:
raise ValueError(f"duration must be positive, got {duration}")There was a problem hiding this comment.
P.S.: Claude says "argparse can enforce the conditional requirement via subcommands". It would look something like this:
extract-clips uniform --video_path v.mp4 --duration 100 --num_clips 5
extract-clips manual --video_path v.mp4 --duration 100 --start_frames 0 500 1000I have never done this, and I think I prefer the argparse to have no "important" logic, but just in case you want to look into it.
There was a problem hiding this comment.
another great suggestion @sfmig! I also agree that I don't want argparse performing this logic, it's simple in the manual case but we'll likely have more/more complex logic checks for other sampling strategies, so this should be handled in extract_clips_*.
| "Supported strategies: 'manual', 'uniform'." | ||
| ) | ||
|
|
||
| return [extract_clip(video_path, sf, duration) for sf in start_frames] |
There was a problem hiding this comment.
Maybe to avoid confusion between both functions:
| return [extract_clip(video_path, sf, duration) for sf in start_frames] | |
| return [extract_single_clip(video_path, sf, duration) for sf in start_frames] |
|
@sfmig thanks for the great feedback! I incorporated many of you suggestions. I also imagine at some point we'll build a proper CLI module and move I have one test failing: https://github.com/neuroinformatics-unit/poseinterface/actions/runs/27362615831/job/80853717159?pr=57. Do you know why this is the case? I didn't touch this module on my side, is this related to some update in |
Ah yeah, that's because of the breaking change of dimension names in movement v0.17.0. To unblock the ongoing PRs, we could temporarily pin movement to v0.16.0 here and then we can follow up with a separate PR bringing the codebase up-to-speed with |
|
@themattinthehatt, I've quickly fixed the If you give #59 a quick approval, we can merge that first and then rebase your 2 PRs to |
Description
What is this PR
Why is this PR needed?
Current repo only has a single clip selection strategy (
manual).What does this PR do?
Adds a
uniformclip sampling strategy option, and updates the CLI to take sampling strategy as input.References
Builds on #39.
How has this PR been tested?
Added new unit tests, checked all tests are passing locally before making PR.
Is this a breaking change?
Yes; the previous clip extraction entry point was
extract-clip; this has been updated toextract-clipsto work with multiple clips; in addition, the command line args changed to allow for multiple clip sampling strategies.Does this PR require an update to the documentation?
Minor update renaming the clip extraction entrypoint from
extract-cliptoextract-clips.Checklist: